117 research outputs found

    From Frequency to Meaning: Vector Space Models of Semantics

    Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term-document, word-context, and pair-pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field.

    Towards Terascale Knowledge Acquisition

    Although vast amounts of textual data are freely available, many NLP algorithms exploit only a minute percentage of it. In this paper, we study the challenges of working at the terascale. We present an algorithm, designed for the terascale, for mining is-a relations that achieves similar performance to a state-of-the-art linguistically-rich method. We focus on the accuracy of these two systems as a function of processing time and corpus size.

    Optimization of electricity / hydrogen cogeneration from generation IV nuclear energy systems

    One of the main motivations for studying and developing Generation IV (Gen IV) reactors of the VHTR (Very High Temperature Reactor) design is their capacity to produce both electricity and H2 (hydrogen) efficiently. This study aims to develop an optimization methodology for systems that cogenerate H2 and electricity from Gen IV nuclear reactors, with respect to energy constraints, economics, and market conditions in terms of demand. It falls within the scope of a collaboration between the Laboratoire de Génie Chimique (Toulouse, France) and the Commissariat à l’Energie Atomique (CEA, Cadarache, France) to compare various cogeneration systems from both energy and economic viewpoints. This paper presents the results of an optimization study based on the “minimal destruction of exergy” or “exergy loss” concept. This criterion, used within the framework of a mono-objective genetic algorithm optimizer, was applied successfully to electric and heat production from Gen IV systems.

    Semantic lexicon adaptation for use in query interpretation

    We describe improvements to the use of semantic lexicons by a state-of-the-art query interpretation system powering a major search engine. We successfully compute concept label importance information for lexicon strings; lexicon augmentation with such information leads to a 6.4% precision increase on affected queries with no query coverage loss. Finally, lexicon filtering based on label importance leads to a 13% precision increase, but at the expense of query coverage.

    Impact of pre-hospital handling and initial time to cranial computed tomography on outcome in aneurysmal subarachnoid hemorrhage patients with out-of-hospital sudden cardiac arrest—a retrospective bi-centric study

    Background: Aneurysmal subarachnoid hemorrhage (SAH) presents occasionally with cardiac arrest (CA). The impact of pre-hospital and emergency room (ER) treatment on outcome remains unclear. Therefore, we investigated the impact of pre-hospital treatment, focusing on lay cardiopulmonary resuscitation (CPR), and ER handling on the outcome of SAH patients with out-of-hospital CA (OHCA). Methods: In this bi-centric retrospective analysis, we reviewed SAH databases for OHCA and CPR from January 2011 to June 2021. Patients were analyzed for general clinical and epidemiological parameters. CPR data were obtained from ambulance reports and information on ER handling from the medical records. Data were correlated with patient survival at hospital discharge as a predefined outcome parameter. Results: Of 1,120 patients with SAH, 45 (4.0%) were identified with OHCA and CPR, 38 of whom provided all required information and were included in this study. Time to resuscitation was significantly shorter with lay resuscitation (5.3 ± 5.2 min vs. 0.3 ± 1.2 min, p = 0.003). Nineteen patients were not initially scheduled for cranial computed tomography (CCT), resulting in a significantly longer time interval to first CCT (mean ± SD: 154 ± 217 min vs. 40 ± 23 min; p < 0.001). Overall survival to discharge was 31.6%. Pre-hospital lay CPR was not associated with higher survival (p = 0.632). However, we observed a shorter time to first CCT in surviving patients (p = 0.065). Conclusions: OHCA in SAH patients is not uncommon. Besides high-quality CPR, time to diagnosis of SAH appears to play an important role. We therefore recommend considering CCT diagnostics as part of the diagnostic algorithm in patients with OHCA.

    A genome-wide association study of anorexia nervosa suggests a risk locus implicated in dysregulated leptin signaling

    J. Kaprio, A. Palotie, A. Raevuori-Helkamaa and S. Ripatti are members of the Eating Disorders Working Group of the Psychiatric Genomics Consortium. Erratum in: Sci Rep. 2017 Aug 21;7(1):8379, doi: 10.1038/s41598-017-06409-3. We conducted a genome-wide association study (GWAS) of anorexia nervosa (AN) using a stringently defined phenotype. Analysis of phenotypic variability led to the identification of a specific genetic risk factor that approached genome-wide significance (rs929626 in EBF1 (Early B-Cell Factor 1); P = 2.04 × 10⁻⁷; OR = 0.7; 95% confidence interval (CI) = 0.61-0.8) with independent replication (P = 0.04), suggesting a variant-mediated dysregulation of leptin signaling may play a role in AN. Multiple SNPs in LD with the variant support the nominal association. This demonstrates that although the clinical and etiologic heterogeneity of AN is universally recognized, further careful sub-typing of cases may provide more precise genomic signals. In this study, through a refinement of the phenotype spectrum of AN, we present a replicable GWAS signal that is nominally associated with AN, highlighting a potentially important candidate locus for further investigation. Peer reviewed.

    GestaltMatcher Database - A global reference for facial phenotypic variability in rare human diseases

    The most important factor that complicates the work of dysmorphologists is the significant phenotypic variability of the human face. Next-Generation Phenotyping (NGP) tools that assist clinicians with recognizing characteristic syndromic patterns are particularly challenged when confronted with patients from populations different from their training data. To that end, we systematically analyzed the impact of genetic ancestry on facial dysmorphism. For that purpose, we established the GestaltMatcher Database (GMDB) as a reference dataset for medical images of patients with rare genetic disorders from around the world. We collected 10,980 frontal facial images - more than a quarter previously unpublished - from 8,346 patients, representing 581 rare disorders. Although the predominant ancestry is still European (67%), data from underrepresented populations have been increased considerably via global collaborations (19% Asian and 7% African). This includes previously unpublished reports for more than 40% of the African patients. The NGP analysis on this diverse dataset revealed characteristic performance differences depending on the composition of training and test sets corresponding to genetic relatedness. For clinical use of NGP, incorporating non-European patients resulted in a profound enhancement of GestaltMatcher performance. The top-5 accuracy rate increased by +11.29%. Importantly, this improvement in delineating the correct disorder from a facial portrait was achieved without decreasing the performance on European patients. By design, GMDB complies with the FAIR principles by rendering the curated medical data findable, accessible, interoperable, and reusable. This means GMDB can also serve as data for training and benchmarking. In summary, our study on facial dysmorphism on a global sample revealed a considerable cross ancestral phenotypic variability confounding NGP that should be counteracted by international efforts for increasing data diversity. 
GMDB will serve as a vital reference database for clinicians and a transparent training set for advancing NGP technology.